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Report  Title 

Abductive  Inference  for  Ccombat:  Using  SCARE-S2  to  Find  High-Value  Targets  in  Afghanistan 

ABSTRACT 

Recently,  geospatial  abduction  was  introduced  by  the 
authors  in  (Shakarian,  Subrahmanian,  and  Spaino  2010) 
as  a  way  to  infer  unobserved  geographic  phenomena 
from  a  set  of  known  observations  and  constraints  between 
the  two.  In  this  paper,  we  introduce  the  SCARES2 
software  tool  which  applies  geospatial  abduction 
to  the  environment  of  Afghanistan.  Unlike  previous 
work,  where  we  looked  for  small  weapon  caches  supporting 
local  attacks,  here  we  look  for  insurgent  highvalue 
targets  (HVT’s),  supporting  insurgent  operations 
in  two  provinces.  These  HVT’s  include  the  locations 
of  insurgent  leaders  and  major  supply  depots.  Applying 
this  method  of  inference  to  Afghanistan  introduces 
several  practical  issues  not  addressed  in  previous  work. 

Namely,  we  are  conducting  inference  in  a  much  larger 
area  (24;  940  sq  km  as  compared  to  675  sq  km  in  previous 
work),  on  more  varied  terrain,  and  must  consider 
the  influence  of  many  local  tribes.  We  address  all  of 
these  problems  and  evaluate  our  software  on  6  months 
of  real-world  counter-insurgency  data.  We  show  that 
we  are  able  to  abduce  regions  of  a  relatively  small  area 
(on  average,  under  100  sq  km)  that  are  more  dense  with 
HVT’s  (35  □  more  than  the  overall  area  considered). 
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Abstract 

Recently,  geospatial  abduction  was  introduced  by  the 
authors  in  (Shakarian,  Subrahmanian,  and  Spaino  2010) 
as  a  way  to  infer  unobserved  geographic  phenomena 
from  a  set  of  known  observations  and  constraints  be¬ 
tween  the  two.  In  this  paper,  we  introduce  the  SCARE- 
82  software  tool  which  applies  geospatial  abduction 
to  the  environment  of  Afghanistan.  Unlike  previous 
work,  where  we  looked  for  small  weapon  caches  sup¬ 
porting  local  attacks,  here  we  look  for  insurgent  high- 
value  targets  (HVT’s),  supporting  insurgent  operations 
in  two  provinces.  These  HVT’s  include  the  locations 
of  insurgent  leaders  and  major  supply  depots.  Apply¬ 
ing  this  method  of  inference  to  Afghanistan  introduces 
several  practical  issues  not  addressed  in  previous  work. 
Namely,  we  are  conducting  inference  in  a  much  larger 
area  (24,  940  sq  km  as  compared  to  675  sq  km  in  pre¬ 
vious  work),  on  more  varied  terrain,  and  must  consider 
the  influence  of  many  local  tribes.  We  address  all  of 
these  problems  and  evaluate  our  software  on  6  months 
of  real-world  counter-insurgency  data.  We  show  that 
we  are  able  to  abduce  regions  of  a  relatively  small  area 
(on  average,  under  100  sq  km)  that  are  more  dense  with 
HVT’s  (35  x  more  than  the  overall  area  considered). 

Introduction 

Insurgents  operating  in  Afghanistan  require  a  substantial 
command-and-control  (C2)  and  logistics  support  to  conduct 
successful  attacks.1  Military  analysts  refer  to  elements  that 
provide  C2  and  logistics  support  for  large  number  of  insur¬ 
gent  cells  as  “high-value  targets”  (“HVTs”)  as  the  elimina¬ 
tion  of  these  HVTs  can  have  a  significant  impact  on  insur¬ 
gent  operations.  As  a  result,  NATO  and  Afghan  forces  often 
concentrate  on  finding  these  HVTs  in  an  attempt  to  reduce 
the  level  of  violence  in  the  country.  The  insurgents  have  a 
limited  number  of  these  HVTs  that  are  required  to  support 
the  activities  of  lower-level  insurgent  cells.  Additionally, 
terrain  and  cultural  considerations  place  constraints  on  the 
relationships  between  an  HVT  and  the  lower  level  insurgent 
cell  it  supports.  Knowing  the  locations  of  the  lower-level 

Copyright  ©  2011,  Association  for  the  Advancement  of  Artificial 
Intelligence  (www.aaai.org).  All  rights  reserved. 

'Note  that  throughout  this  paper,  ’attack’  refers  to  an  attack 
conducted  by  the  insurgents  against  coalition  forces. 


cells  (based  on  attack  data),  as  well  as  these  constraints  (ob¬ 
tained  from  socio-cultural  and  terrain  data),  we  wish  to  ab- 
ductively  infer  where  the  HVTs  can  be  found.  This  is  clearly 
an  instance  of  a  geospatial  abduction  problem  originally  in¬ 
troduced  by  the  authors  in  (Shakarian,  Subrahmanian,  and 
Spaino  2010)  and  later  extended  in  (Shakarian  and  Subrah¬ 
manian  2010).  We  previously  applied  geospatial  abduction 
to  find  small  weapons  hide-sites  related  to  local  attacks  in 
Baghdad  in  (Shakarian,  Subrahmanian,  and  Spaino  2009). 
However,  the  environment  of  Afghanistan  provides  several 
challenges  that  we  did  not  address  in  the  other  work.  These 
include  the  following: 

1 .  In  Afghanistan,  the  influence  of  multiple  tribes  affect  re¬ 
lationships  between  areas  on  the  ground.  How  do  we  ac¬ 
count  for  this  influence? 

2.  In  the  two  provinces  we  considered  in  Afghanistan,  the 
terrain  is  extremely  varied,  unlike  the  more  uniform  urban 
terrain  of  Baghdad.  How  do  we  account  for  this  variance 
in  terrain? 

3.  Unlike  our  application  to  Baghdad  (25  x  27  km  area), 
where  we  could  easily  discretize  the  space,  our  data-set 
for  Afghanistan  includes  two  provinces  covering  a  total 
area  580  x  430  km,  making  discretizing  of  the  space  im¬ 
practical.  How  do  we  best  represent  the  space? 

We  note  that  using  only  attack  data  and  socio-cultural  infor¬ 
mation  alone  will  most  likely  be  insufficient  to  pinpoint  a 
HVT.  However,  the  real-world  requirements  imposed  on  the 
insurgents  by  logistic  and  socio-cultural  variables  should  al¬ 
low  a  ground  commander  to  significantly  reduce  the  search- 
space  for  such  targets.  Intelligence  professionals  identify 
“Named  Areas  of  Interest”  or  NAIs  -  regions  on  the  ground 
where  they  think  HVTs  can  be  located.  Then,  other  intel¬ 
ligence  assets,  such  as  unmanned  aerial  vehicles  (UAVs)  or 
tactical  human-intelligence  (HUMINT)  teams  can  be  used 
in  the  NAIs  to  pinpoint  targets.  (US  Army  1994)  In  a  large 
area,  such  as  a  province  of  Afghanistan,  UAVs  or  HUMINT 
cannot  be  used  effectively  without  first  determining  good 
NAIs.  To  address  this  problem  fir  the  specific  case  of 
Afghanistan,  we  adapted  the  region-based  abduction  frame¬ 
work  of  (Shakarian  and  Subrahmanian  2010)  to  our  scenario 
by  creating  an  entirely  new  piece  of  software  for  abduc¬ 
tive  inference  called  the  SCARE-S2  (Spatio-Cultural  Ab¬ 
ductive  Reasoning  Engine  System  2).  SCARE-S2  abduc- 


tively  finds  regions  that  can  then  later  be  used  to  cue  other 
intelligence  assets  to  find  an  HVT.  Applying  SCARE-S2 
to  our  Afghanistan  dataset  produced  regions  with  a  signif¬ 
icantly  higher  density  of  HVTs  (by  a  factor  of  35),  where 
half  of  the  abduced  ground  regions  (normally  of  an  area  less 
than  100fc?n2)  would  contain  at  least  one  HVT.  Due  to  the 
high  density  of  HVTs  within  the  regions,  we  feel  that  they 
could  be  used  for  NAIs  and  aide  in  combat  operations. 

This  paper  is  organized  as  follows.  First,  we  briefly  re¬ 
view  the  region-based  abduction  framework  of  (Shakarian 
and  Subrahmanian  2010)  and  present  some  extensions  we 
used  to  address  our  Afghanistan-specific  problem.  Then  we 
describe  our  dataset  for  Afghanistan.  This  is  followed  by 
a  description  of  our  implementation  along  with  our  experi¬ 
mental  results  and  discussion. 

Region-Based  Geospatial  Abduction 

In  this  section,  we  briefly  review  the  framework  of  (Shakar¬ 
ian  and  Subrahmanian  2010)  -  which  is  not  new  material. 
This  is  followed  by  our  practical,  Afghan-specific  extensions 
in  the  next  section  (which  is  new  in  this  paper). 

We  assume  the  existence  of  a  real-valued  M  x  N  space 
S  whose  elements  are  pairs  of  real  numbers  from  the  set 
[0,  M]  x  [0,  N],  An  observation  is  any  member  of  S.  We  use 
O  to  denote  an  arbitrary,  but  fixed,  finite  set  of  observations. 
We  assume  there  are  real  numbers  a  <  (3  such  that  for  each 
observation  o  ,  there  exists  a  partner  pa  (to  be  found)  whose 
distance  from  o  is  in  the  interval  [ct,/?].2  Without  loss  of 
generality,  we  also  assume  that  all  elements  of  O  are  over  3 
distance  away  from  the  edge  of  S. 

Throughout  this  paper,  we  assume  the  existence  of  a 
distance  function  d  on  S  satisfying  the  usual  properties  of 
such  distance  functions.3  We  now  define  a  region  and  how 
they  relate  to  the  set  of  observations.  Our  intuition  is  simple 
-  a  region  explains  an  observation  if  that  region  contains  a 
partner  point  for  that  observation. 

Region/Region  Explanation 

•  A  region  r  is  a  subset  of  S  such  that  for  any  two  points 
(x,  y),  (x' ,  y')  £  r,  there  is  sequence  a  of  line  segments 
from  (x,  y)  to  (x1 ,  if  )  s.t.  no  line  segment  lies  outside  r. 

•  A  region  r  explains  point  o  in  S  iff  there  exists  a  point 
p  £  r  such  that  d(o,p)  £  [a,  0\ . 

Note  that  regions  can  have  any  shape  and  may  overlap. 
Throughout  this  paper,  we  assume  that  checking  if  some 
point  o  is  explained  by  region  r  can  be  performed  in 
constant  (i.e.  0(1))  time.  This  is  a  reasonable  assumption 
for  most  regular  shaped  regions  like  circles,  ellipses  and 
polygons. 

Region  Explanation  Problem  (REP) 

INPUT:  Given  a  space  S ,  distance  interval  [a,  0\,  set  O 
of  observations,  set  R  of  regions,  and  natural  number 
k£[l,\0\\. 

2(Shakarian.  Subrahmanian.  and  Spaino  2010)  describes  meth¬ 
ods  to  learn  a,  (3  automatically  from  historical  data. 

3d(x,  x)  =  0;  d{x,  y)  =  d(y,  x);  d(x,  y)  +  d(y,  z)  >  d{x,  z). 


OUTPUT:  Set  R'  C  R,  where  \R'\  <  k  and  for  each  o  £  O, 
there  is  an  r  £  R'  s.t.  r  sub-(super-)  explains  o. 

(Shakarian  and  Subrahmanian  2010),  showed  this  deci¬ 
sion  problem  to  be  strongly  NP-complete,  meaning  that 
the  optimization  version  (that  seeks  to  find  an  explanation 
of  minimal  cardinality)  cannot  be  approximated  by  a  fully 
polynomial-time  approximation  algorithm  unless  P==NP. 
However,  the  problem  also  reduces  to  an  instance  of  set- 
cover,  which  means  that  a  solution  can  be  obtained  within 
a  reasonable  approximation  factor  (1  —  lg(/),  where  /  is 
the  maximum  number  of  regions  associated  with  any  given 
observation).  We  have  included  the  algorithm,  GREEDY- 
REP-MC2  from  that  paper.4 


GREEDY-REP-MC2 

INPUT:  Set  O  of  observations,  set  R  of  regions 
OUTPUT:  R'  C  R 

1.  Let  O  =  [JrGR{Or} 

2.  For  each  observation  o  £  O,  let  GRP0  =  {Or  £  0|o  £  Or} 

3.  For  each  observation  o  £  O,  let  REL0  =  {o'  £  0\o'  £ 
UeveGAPo  C>r}  and  let  key°  =  I REL0\ 

4.  Let  O'  =  O,  set  R'  =  0 

5.  While  not  O'  =  0  loop 

(a)  Let  o  be  the  element  of  O  where  key0  is  minimal. 

(b)  Let  the  element  Or  be  the  member  of  GRP0  s.t.  \Or  H  O' \ 
is  maximized. 

(c)  If  there  are  more  than  one  set  Or  that  meet  the  criteria  of 
line  5b.  pick  the  set  w.  the  greatest  cardinality. 

(d)  R'  =  R'  U  r 

(e)  For  each  o'  £  Or  D  O' ,  do  the  following: 

i.  O'  =  O'  -  o' 

ii.  For  each  o"  £  O'  fi  RELai,  key0" - 

6.  Return  R' 


Adaptations  for  Afghanistan.  There  are  two  parts  of  the 
formalism  of  region-based  abduction  that  are  generally  de¬ 
fined  -  the  distance  function  (d)  and  the  set  of  regions  (R). 
In  the  experiments  of  (Shakarian  and  Subrahmanian  2010), 
we  used  a  Euclidean  distance  function  and  we  generated  the 
regions  from  the  REGION-GEN  algorithm  of  that  paper, 
which  discretizes  the  entire  space  (hence,  making  it  imprac¬ 
tical  for  use  here).  Hence,  we  use  d  and  R  as  a  way  to  adapt 
region-based  abduction  to  our  Afghanistan  scenario  and  ad¬ 
dress  each  of  the  three  concerns  outlined  in  the  introduction. 
Our  strategy  is  to  build  a  special  distance  function,  dafgh, 
and  use  this  function  and  the  set  of  observations,  O,  to  gen¬ 
erate  R. 

To  address  the  first  concern,  that  of  multiple  tribes,  as¬ 
sume  we  have  a  set  of  tribes,  T  =  {ti, . . . ,  tm}.  Based  on 
our  data  set,  we  can  assume  we  have  the  following  function 
tribes  :  S  — >  2T  which  takes  a  point  in  the  space  and  returns 


4Due  to  lack  of  space,  we  have  omitted  an  example  illustrating 
how  this  algorithm  works.  We  refer  the  reader  to  (Shakarian  and 
Subrahmanian  2010)  for  such  examples. 


a  set  of  tribes.  Two  points  in  the  space,  pi,p2,  are  tribally- 
related  iff  tribes(pi)  IT  tribes(j>2)  ^  0.  When  we  create  our 
distance  function,  we  will  do  so  in  a  way  to  enforce  this  as 
an  additional  criterion  that  there  must  be  at  least  one  tribe 
that  has  a  presence  in  the  observation  and  partner  location. 
The  idea  here  is  that  an  HVT  must  have  a  tribal-relationship 
with  the  lower-level  cell  conducting  the  attack,  otherwise  the 
two  groups  may  not  have  a  confluence  of  interest. 

To  address  the  second  and  third  concerns,  we  appeal  to 
the  idea  that  the  road  networks  of  Afghanistan  binds  parts  of 
this  varied  country  together.  Such  sentiments  are  echoed  in 
other  work  such  as  (Conover  2010).  So,  for  any  two  villages 
on  the  road  network  (RN,  an  undirected  graph  where  the 
vertices  are  villages)  of  Afghanistan,  we  define  the  function 
spRN  '■  $  x  S  — >  3?  to  return  the  shortest  distance  on  the 
Afghan  road  network  between  the  two  points.  Using  shortest 
path  on  a  road  network  is  also  useful  as  our  attack  and  HVT 
data  were  all  geolocated  by  village.  Hence,  we  put  these 
concepts  together  to  create  our  new  distance  function,  dafgh, 
defined  below. 

,  //-,_/  spRN(p,p')  ifftribes(p)  n  tribes {p1 )  ^0 

afgh\P,P )  oo  otherwise 

We  use  this  function  to  generate  regions  via  the  algorithm 
REGION-GEN-AFGH  -  presented  for  the  first  time  in  this 
paper.  A  practical  improvement  we  introduced  was  in  deter¬ 
mining  the  set  Va  for  each  observation.  We  first  determined 
the  set  V}EUC)  V0  computed  with  a  Euclidean  distance  func¬ 
tion  on  the  interval  [0 ,  f3\  -  as  the  Euclidean  distance  function 
can  be  calculated  much  faster  than  shortest-path.  From  this 
set,  V0  is  determined.  It  should  be  noted  that  the  algorithm 
runs  with  a  complexity  0(K  ■  \0\  ■  T(RN))  where  K  is  a 
constant  bound  on  the  number  of  partners  distance  (3  away 
from  a  given  observation  and  T(RN)  is  the  time  complexity 
to  find  the  shortest  path  between  two  points  in  RN.  Another 
practical  extension  we  added  was  to  the  output  of  GREEDY- 
REP-MC2.  Any  returned  region  over  1000  sq  km  was  not 
included  in  the  output.  Our  intuition  here  is  that  a  region 
so  large  is  not  useful  to  an  analyst  attempting  to  cue  other 
intelligence  assets. 


REGION-GEN-AFGH 

INPUT:  Space  S ,  observations  O,  reals  a,  (3 
OUTPUT:  Set  of  regions  R 

1.  Let  the  road-network,  RN  =  (V  E) 

2.  For  each  o  £  O.  find  the  set  V0  =  {n  G  V\dafgh  (o,v)  G  [a,  13] 

3.  Let  L  =  Uogo  E°r  each  P  £  L  let  Ov  be  the  set  of  obser¬ 
vations  that  can  be  associated  with  it. 

4.  Partition  L  into  subsets,  denoted  L<p/,  where  O'  C  O  and  p  G 
L0'  iff  Op  =  O'. 

5.  For  each  Lqi,  create  region  r  that  is  the  minimum-enclosing 
rectangle  of  all  elements  in  Lqi  .  Add  r  to  R. 

6.  Return  set  R. 


Afghanistan  Data  Set 

Our  data-set  consisted  of  HVTs  and  attack  data  from  the 
Afghan  provinces  of  Hilmand  and  Kandahar  from  January 
-  June  2010  supported  by  tribal  and  road  network  informa¬ 
tion.  Below  we  provide  details  of  the  data-set. 

Provincial  Data.  All  provincial  data,  including  boundaries 
of  provinces  and  districts,  road  networks,  and  village  lo¬ 
cations  were  provided  by  (Afghanistan  Information  Man¬ 
agement  Services  (AIMS)  ).  We  considered  the  Hilmand 
and  Kandahar  provinces,  which  consist  of  29  districts.  The 
road-network  (RN  =  (V,E))  is  an  undirected  graph  of 
30,  304  vertices  (1604  of  which  are  identified  as  villages) 
and  61, 064  edges. 

Attack  Data.  We  used  a  series  of  203  attack  events  in 
Afghanistan  from  the  (National  Counter-Terrorism  Center 
(NCTC)  ).  103  of  these  events  were  from  January-April 
2010  and  used  to  learn  the  [a,  f3\  distance  constraints,  while 
the  remaining  100  attacks  (May-June  2010)  were  used  as 
set  O  of  observations.  We  actually  divided  the  set  of  obser¬ 
vations  into  12  subsets,  Oi  C  02  C  . . .  C  012,  with  each 
subsequent  set  of  observations  containing  5  days  more  worth 
of  attacks  than  the  previous  (i.e.  Oi  was  May  2-6  and  CL 
was  May  2-11).  All  attacks  in  the  WITS  database  were  iden¬ 
tified  by  village  -  corresponding  with  the  AIMS  information 
described  earlier. 

HVT  Data.  We  collected  a  total  of  78  HVTs  based  on  of¬ 
ficial  reports  from  (International  Security  Assistance  Force 
(ISAF)  Afghanistan  ).  These  reports  spanned  January- 
September  2010.  We  used  the  reports  of  January-April  2010 
(27  HVTs)  to  learn  the  [a,  [3]  distance  constraints  and  the  re¬ 
mainder  for  a  ground-truth  comparison  (notice,  this  time  in¬ 
terval  is  greater  than  that  used  for  the  set  of  observations,  as 
an  associated  HVT  with  an  attack  may  not  necessarily  have 
been  located  in  the  same  time  window  described  earlier). 
As  with  the  attack  data,  each  HVT  was  geo-located  by  the 
ISAF  report  with  a  village,  which  corresponded  to  the  AIMS 
information.  We  manually  identified  only  certain  weapons 
caches  and  captured/killed  enemy  personnel  as  HVTs.  Be¬ 
low  we  present  our  criteria  in  Figure  1  -  it  is  based  on  the 
combat  experience  of  one  of  the  authors. 

Tribal  Data.  To  create  the  tribes  function,  we  used  the 
tribal  data  from  (Naval  Postgraduate  School  (NPS) )  that  as¬ 
sociated  districts  in  Afghanistan  with  a  set  of  tribes.  All 
together,  there  were  23  tribes  reported  by  the  NPS. 

Distance  Constraints.  Using  the  simple  algorithm  FIND- 
BOUNDS  of  (Shakarian,  Subrahmanian,  and  Spaino  2010), 
which  essentially  returns  an  upper  and  lower  distance  bound 
on  the  shortest  distance  to  an  HVT  given  a  set  of  attacks,  de¬ 
termined  the  [a,  (3]  bounds  to  be  [0.0, 65.88]  km  based  on  the 
historical  attack  and  HVT  data  from  January-April  2010. 

Experimental  Results 

Setup.  Our  implementation  of  SCARE-S2,  runs  on  a 
Lenovo  T400  ThinkPad  laptop  equipped  with  an  Intel  Core 
2  Duo  T9400  processor  operating  at  2.53  GHz  and  4.0  GB 
of  RAM.  The  computer  was  running  Windows  Vista  64- 
bit  Business  edition  with  Service  Pack  1  installed.  This 
modest  hardware  setup  was  selected  as  deploying  units  to 
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•  Cache  HVTs: 

-  The  cache  contains  3x  or  more  mortar  rounds 

-  The  cache  contains  mortar  tubes 

-  The  cache  contains  3x  or  more  rockets 

-  The  cache  contains  lOx  or  more  grenades 

-  The  cache  contains  5x  or  more  RPG  launchers 

-  The  cache  contains  20x  or  more  RPG  rounds 

-  The  cache  contains  15x  or  more  AK-47’s  (or  other  similar 
rifles) 

-  The  cache  contains  3x  or  more  land-mines 

-  The  cache  contains  "rooms”  full  of  communications 
equipment  (or  "rooms"  full  of  any  type  of  equipment) 

-  The  cache  contains  a  DsHK  or  any  other  anti-aircraft 
weapon  (including  any  number  of  Stinger  missiles) 

-  The  cache  contains  5x  or  more  RPK  machine  guns  (or 
similar  capable  systems  such  as  M60,  M249,  etc.) 

-  The  cache  contains  5x  or  more  sniper  rifles  (such  as  a  Dra- 
gunov) 

•  Personnel  HVTs: 

-  Reported  listed  individual  as  an  insurgent  “commander” 

-  Reported  listed  individual  as  an  insurgent  “sub¬ 
commander” 

-  Reported  listed  individual  as  an  insurgent  “planner” 


Figure  1:  HVT  criteria. 
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Figure  2:  Number  of  attacks  vs.  runtime  (average  over  10 
trials)  and  average  region  area. 


Afghanistan  are  typically  equipped  with  Windows-based 
laptop  systems.  Isolated  command  posts,  with  limited  con¬ 
nectivity  to  a  network  due  to  terrain  restrictions  may  only 
have  access  to  this  limited  computation  power. 

We  implemented  SCARE  in  approximately  4000  lines  of 
Java  code.  Java  Runtime  Environment  (JRE)  Version  6  Up¬ 
date  14  was  used.  The  software  was  developed  with  Eclipse 
version  3.4.0.  We  used  the  JGraphT  library  version  0.81  to 
implement  the  Fibonacci  heap  and  the  graph  structure.  Ad¬ 
ditionally,  BBN  OpenMap  was  used  for  some  of  the  geospa¬ 
tial  methods.  We  also  added  the  capability  to  output  KML 
files  so  that  the  results  could  be  viewed  in  Google  Earth 
-  we  used  Google  Earth  4.3.7284.3916  (beta)  operating  in 
DirectX  mode.  Experimental  results  were  also  collected  in 
CSV-formatted  spreadsheets. 

Runtime  Experiments.  We  examined  runtime  of  the  algo¬ 
rithm  by  running  the  algorithm  on  each  of  the  12  subsets 
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Figure  3:  Number  of  attacks  vs.  number  of  regions  and  HVT 
density. 


of  observations  described  earlier.  We  observed  two  things: 
that  the  relationship  between  runtime  and  number  of  attacks 
was  linear  and  that  the  runtime  of  REGION-GEN-AFGH 
dominated  the  runtime  of  GREEDY-REP-MC2  (which  was 
negligible).  This  is  primarily  the  result  of  the  calculation  of 
the  shortest  path.  As  stated  earlier,  this  relationship  is  linear, 
so  our  result  depicted  in  Figure  2  is  unsurprising. 

Area  of  Regions.  As  with  (Shakarian  and  Subrahmanian 
2010),  we  examine  the  average  area  of  the  regions.  In  gen¬ 
eral,  smaller  regions  are  preferred  and  as  set  O  grows,  the 
regions  should  become  smaller.  In  each  of  the  12  trials,  there 
was  never  more  than  one  region  over  200  sq  km,  and  as  set 
O  increased,  the  average  area  approach  100  sq  km  -  this  is 
exactly  what  we  are  looking  for.  We  plot  the  average  and 
maximum  areas  in  Figure  2.  Note  that  a  few  spikes  in  aver¬ 
age  area  are  directly  related  to  spikes  in  maximum  area  from 
a  few  outliers  produced  on  some  runs.  Note  that  only  a  third 
of  our  runs  produced  a  region  over  200  sq  km.  Although 
even  100  sq  km  may  seem  like  a  large  area,  we  must  con¬ 
sider  the  density  of  villages  -  which  is  what  we  are  attempt¬ 
ing  to  locate.  The  overall  density  of  villages  for  the  entire 
area  considered  was  0.0064. 5  By  the  nature  of  how  the  re¬ 
gions  are  generated,  they  inherently  have  more  villages.  We 
observed  that  when  we  considered  the  entire  set  of  attacks, 
no  region  contained  more  than  8  villages,  with  an  average 
village  density  of  4.3  villages  per  region.  As  such  is  the 
case,  we  feel  that  the  regions  produced  by  SCARE-S2  will 
be  helpful  in  directing  intelligence,  surveillance,  and  recon¬ 
naissance  (ISR)  assets. 

HVTs  Enclosed  by  Regions.  In  Figure  3,  we  plot  the  num¬ 
ber  of  regions  returned  by  each  run,  as  well  as  the  number  of 
regions  that  enclose  at  least  one  HVT  from  the  ground-truth 
set.  Although  the  number  of  regions  increase  with  the  num¬ 
ber  of  attacks  (from  1  to  6),  the  number  of  regions  enclosing 
an  HVT  also  increase  (from  0  to  3).  While  we  should  ex¬ 
pect  that  solutions  with  more  regions  to  enclose  more  HVTs, 
we  must  also  recall  that  the  regions  become  smaller  with 
each  run.  Further,  we  also  examined  HVT  density  (number 
of  HVTs  divided  by  total  area  of  all  regions),  which  also 
increased  with  each  run  (note  we  had  two  outliers,  identi- 


5In  the  newest  version,  SCARE-S2  also  runs  the  geospatial  ab¬ 
duction  algorithm  of  (Shakarian,  Subrahmanian,  and  Spaino  2010) 
which  abduces  points  (villages,  in  this  case).  Hence,  the  output 
now  not  only  included  regions,  but  villages  of  interest  as  well  - 
which  allows  us  to  further  reduce  the  search-space  for  HVTs. 


Figure  4:  Regions  returned  after  considering  attacks  from  Figure  5:  Close-up  view  of  regions  A-C  with  actual  HVTs 
May-June  2010.  plotted. 


lied  in  Figure  3  as  points  A  and  B.  In  these  two  runs,  the 
software  returned  larger  regions  of  size  719.68  sq  km  and 
403.34  sq  km  that  enclosed  a  large  urban  area  where  many 
HVTs  were  found.  Eliminating  these  regions  from  the  solu¬ 
tion  would  eliminate  these  artificial  spikes  in  density).  When 
we  considered  the  entire  two  months  of  observed  attacks,  the 
HVT  density  in  the  regions  was  over  35  times  greater  than 
the  overall  HVT  density  in  the  provinces.  We  remind  the 
reader  that  the  the  regions  are  meant  to  be  used  as  Named 
Areas  of  Interest  (NAIs)  for  use  by  intelligence  personnel. 
These  NAIs  would  then  be  used  to  cue  other  intelligence 
assets  (for  example,  a  UAV  or  a  HUMINT  team)  to  con¬ 
duct  a  more  fine-grained  search  (hence,  avoiding  a  search 
in  a  larger  area).  Therefore,  despite  only  half  the  returned 
SCARE  regions  containing  NAIs,  the  small  size  of  the  re¬ 
gions,  along  with  the  high  density  of  HVTs,  make  them  in¬ 
valuable  for  the  intelligence  process. 

Discussion.  We  shall  now  consider  our  final  run  of  the  algo¬ 
rithm,  where  we  considered  the  entire  set  of  100  attacks  from 
May-June  2010.  This  run  produced  the  most  regions  enclos¬ 
ing  HVTs,  the  greatest  HVT  density  (discounting  spikes  A 
and  B),  and  the  smallest  average  region  area. 

This  trial  of  the  software  produced  6  regions,  labeled  A-F, 
shown  in  Figure  4.  Half  of  them  enclosed  an  HVT.  However, 
there  were  other  ISAF  reports  that  did  not  include  village 
information.  We  did  not  consider  these  additional  reports  in 
any  part  of  our  experiments.  However,  all  three  regions  re¬ 
turned  by  this  experiment  that  did  not  enclose  an  HVT  were 
located  in  districts  where  an  HVT  was  reported  (with  no  vil¬ 
lage  information).  For  region  D,  there  were  11  such  reports, 
for  region  F,  there  were  4  such  reports,  and  for  district  E 
there  were  was  one  such  report.  Fet  us  now  consider  the 
HVTs  found  within  regions  A-C,  depicted  in  Figure  5.  Re¬ 
gion  A  (with  an  area  of  102.5  sq  km)  encloses  the  village 
of  Bahram  in  the  Ghorak  district  of  Kandahar.  According  to 
ISAF  PAO  report  2010-05-CA-052,  on  May  5,  2010,  a  com¬ 
bined  ISAF-Afghan  force  captured  a  Taliban  commander  in 
this  village,  who  was  responsible  for  several  improvised  ex¬ 


plosive  device  (IED)  attacks  as  well  as  movement  of  foreign 
fighters  in  the  country.  He  also  had  a  cache  that  included 
automatic  rifles  and  heroin.  Region  B  (with  an  area  of  72.0 
sq  km)  encloses  the  village  of  Makuan,  in  the  Zhari  district 
of  Kandahar.  According  to  ISAF  PAO  report  2010-07-CA- 
11,  on  July  18,  2010,  a  combined  Afghan-ISAF  force  con¬ 
ducted  a  raid  on  a  compound  where  a  Taliban  weapons  fa¬ 
cilitator  was  believed  to  reside.  The  unit  received  fire  from 
insurgents,  and  returned  fire  killing  several  of  them.  As  they 
approached  the  compound,  they  found  several  IED’s  placed 
to  guard  the  facility.  The  compound  was  found  to  be  a  IED 
factory  and  a  bunker  system  that  contained  munitions.  Re¬ 
gion  C  (with  an  area  of  71.0  sq  km)  encloses  the  village 
Kharotan  in  the  Nahri  Sarraj  district  of  Hilmand.  ISAF  PAO 
report  2010-08-CA-161  describes  how  ISAF  forces  detained 
the  Taliban  deputy-commander  of  the  Fashkar  Gah  district 
there  on  August  14,  2010. 

Related  Work  and  Conclusions 

Recently,  there  has  been  some  work  dealing  with  analyti¬ 
cal  and  computational  methods  for  reducing  the  IED  threat 
in  a  counter-insurgency  environment.  (Marks  2009)  uses 
dynamic  programming  scheme  to  determine  optimal  path 
on  a  network  to  conduct  route-clearing  operations.  (Curtin 
2009)  explores  the  use  of  linear-referencing  to  associate  IED 
events  to  certain  parts  of  a  road  network.  In  (Fi  et  al. 
2009),  the  authors  introduce  the  PITS  system  for  predict¬ 
ing  IED  events  based  on  geographic  features  and  other  non¬ 
geographic  event  (such  as  time).  (Benigni  and  Furrer  2008) 
the  authors  look  to  quantify  the  IED  threat  of  a  given  route 
at  specific  times  of  day.  We  would  like  to  point  out  that  all  of 
this  work  deals  with  the  either  the  prediction  of  IED  attacks 
or  avoiding  potential  locations  of  IED  attacks  -  not  locating 
HVTs  (enemy  personnel  or  logistics  sites).  Additionally,  our 
search  for  HVTs  is  at  a  much  larger  scale  -  we  are  consid¬ 
ering  whole  provinces  of  a  very  large  area.  Hence,  the  neu¬ 
tralization  of  the  HVTs  associated  with  an  area  of  this  size 
has  a  greater  effect  on  the  battlefield.  To  our  knowledge,  this 


paper  introduces  the  first  computational  method  for  finding 
HVTs  on  the  counter-insurgency  battlefield. 

As  stated  above,  this  works  build  on  the  concept  of 
geospatial  abduction  introduced  in  (Shakarian,  Subrahma- 
nian,  and  Spaino  2009;  2010;  Shakarian  and  Subrahma- 
nian  2010).  However,  none  of  these  papers  consider  ap¬ 
plying  geospatial  abduction  to  the  Afghanistan  scenario  as 
presented  here,  or  the  special  considerations  already  dis¬ 
cussed.  Geospatial  abduction  is  a  form  of  abductive  infer¬ 
ence,  first  introduced  in  (Peirce  1955).  Two  major  existing 
theories  of  abduction  include  logic -based  abduction  (Eiter 
and  Gottlob  1995)  and  set-covering  abduction  (Bylander  et 
al.  1991).  Geospatial  abduction  is  related  to  set-covering 
abduction  (which  has  been  extensively  explored  in  its  appli¬ 
cation  to  medical  diagnosis  in  (Peng  and  Reggia  1990))  as  it 
reduces  to  an  instance  of  set-cover.  Some  instances  of  other 
problems  such  as  facility  location  and  clustering  can  actu¬ 
ally  be  encoded  in  a  geospatial  abduction,  but  a  reduction  in 
the  opposite  direction  is  not  possible  (see  (Shakarian,  Sub- 
rahmanian,  and  Spaino  2010)  for  a  detailed  discussion  on 
this  comparison). 

In  this  paper  we  introduced  a  piece  of  software  called 
“SCARE-S2”  that  applies  geospatial  abduction  to  the  envi¬ 
ronment  of  Afghanistan.  Unlike  previous  work,  where  we 
looked  for  small  weapon  caches  supporting  local  attacks, 
here  we  looked  for  insurgent  high-value  targets  (HVTs), 
supporting  insurgent  operations  in  two  provinces.  These 
HVTs  included  the  locations  of  insurgent  leaders  and  ma¬ 
jor  supply  depots.  Applying  this  method  of  inference 
to  Afghanistan  introduced  several  practical  issues  not  ad¬ 
dressed  in  previous  work.  Namely,  we  are  conducting  in¬ 
ference  in  a  much  larger  area  (24, 940  sq  km  as  compared 
to  675  sq  km  in  previous  work),  on  more  varied  terrain, 
and  must  consider  the  influence  of  many  local  tribes.  We 
address  all  of  these  problems  and  evaluate  our  software  on 
6  months  of  real-world  counter-insurgency  data.  We  show 
that  we  are  able  to  abduce  regions  of  a  relatively  small 
area  (on  average,  under  100  sq  km)  that  are  more  dense 
with  HVTs  (35  x  more  than  the  overall  area  considered). 
There  are  other  possible  uses  of  geospatial  abduction,  in¬ 
cluding  counter-drug,  police,  and  naturalist  uses.  In  our 
lab,  we  are  also  collecting  data  concerning  illegal  mining 
operations  in  Africa  and  are  considering  geospatial  abduc¬ 
tion  as  a  possible  tool  to  explore  this  international  prob¬ 
lem.  Some  of  these  have  been  described  as  examples  in 
work  such  as  (Shakarian,  Subrahmanian,  and  Spaino  2010; 
Shakarian  and  Subrahmanian  2010),  but  not  explored  from 
an  implementation  standpoint.  Such  future  studies  would 
highlight  other  practical  issues  to  consider  when  applying 
geospatial  abduction  to  real-world  problems,  as  was  done  in 
this  paper  for  the  Afghan  scenario.  One  such  practical  ex¬ 
tension  we  are  considering  is  the  use  social  network  data  to 
relate  observations  and  partners  (as  we  did  in  this  work  with 
tribal  data),  which  could  aide  in  predictions. 
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